Approximating persistent homology for a cloud of $n$ points in a subquadratic time
Abstract
The Vietoris-Rips filtration for an $n$-point metric space is a sequence of large simplicial complexes adding a topological structure to the otherwise disconnected space. Persistent homology is a key tool in topological data analysis and studies topological features of data that persist over many scales. The fastest algorithm for computing persistent homology of a filtration has time $O(M(u) + u^2 \log u)$, where $u$ is the number of updates (additions or deletions of simplices) and $M(u) = O(u^{2.376})$ is the time for multiplication of $u \times u$ matrices. For a space of $n$ points given by their pairwise distances, we approximate the Vietoris-Rips filtration by a zigzag filtration consisting of $u = o(n)$ updates, which is sublinear in $n$. The constant depends on a given error of approximation and on the doubling dimension of the metric space. Then the persistent homology of this sublinear-size filtration can be computed in time $o(n^2)$, which is subquadratic in $n$.

arXiv:1312.1494v1 [cs.CG] 5 Dec 2013

1 Our contributions and related work

The aim of topological data analysis is to understand the shape of unstructured data, often given as finitely many points in a metric space. Usually, the shape of such a point cloud is studied through a filtration of complexes built on the given points. For instance, the Vietoris-Rips complex contains edges, triangles, and tetrahedra spanned by points whose pairwise distances are less than a certain scale. The persistent homology of the resulting filtration over all scales captures topological features that persist over a long interval of scales.

If the given points are densely sampled from a compact set in $\mathbb{R}^d$, the Vietoris-Rips complex at a certain scale correctly represents the topology of the set [1]. For a cloud of $n$ points, the Vietoris-Rips complex may contain up to $O(n^{l+1})$ simplices in dimension $l$, so this large size is the main drawback.
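The description of the Vietoris-Rips complex above can be made concrete with a small brute-force sketch. It is only an illustration under stated assumptions, not the approximation studied in the paper: the function name `vietoris_rips`, the `max_dim` cutoff, and the input format (a symmetric matrix of pairwise distances) are hypothetical choices made here. The loop over all vertex subsets also shows why the complex grows so quickly with $n$.

```python
from itertools import combinations


def vietoris_rips(dist, scale, max_dim=2):
    """Brute-force Vietoris-Rips complex at one scale (illustrative sketch).

    dist    -- symmetric matrix (list of lists) of pairwise distances
    scale   -- a simplex is included when all its pairwise distances are < scale
    max_dim -- largest simplex dimension to enumerate (hypothetical parameter)
    Returns a list of simplices, each a tuple of point indices.
    """
    n = len(dist)
    simplices = [(i,) for i in range(n)]  # every point is a 0-simplex
    for k in range(2, max_dim + 2):  # k vertices span a (k-1)-simplex
        for verts in combinations(range(n), k):
            if all(dist[a][b] < scale for a, b in combinations(verts, 2)):
                simplices.append(verts)
    return simplices


# Three points on a line at positions 0, 1, 2.
dist = [[0, 1, 2],
        [1, 0, 1],
        [2, 1, 0]]
print(vietoris_rips(dist, 1.5))  # [(0,), (1,), (2,), (0, 1), (1, 2)]
```

At scale 1.5 the two outer points are too far apart, so the triangle is absent; raising the scale above 2 adds the edge (0, 2) and the triangle (0, 1, 2).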
Don Sheehy [10] recently approximated the full filtration of Vietoris-Rips complexes on $n$ points in a metric space by a filtration that has size $O(n)$ and approximates the persistent homology with a multiplicative error close to 1. The Sheehy-Vietoris-Rips complex uses a net-tree [9] as a black box. If we run the best algorithm [7, 4] for persistent homology on the Sheehy approximation to the Vietoris-Rips filtration, the overall running time for approximating persistent homology will be $O(n^{2.376})$. This superquadratic time is a bottleneck, but it allows us to replace a sophisticated construction of a net-tree by a simpler algorithm for $k$-farthest neighbors in a metric space.

Problem 1.1. For a cloud of $n$ points in a metric space, approximate the persistent homology of the Vietoris-Rips filtration in a subquadratic time $o(n^2)$.

We solve Problem 1.1 in Theorem 1.2 by building a sublinear-size approximation to the Vietoris-Rips filtration on $n$ given points in a metric space and then running the best algorithm for computing the zigzag persistent homology. Due to the stability of persistent homology [3], the error of approximation at the homology level can be controlled at the level of the filtration.

Theorem 1.2. The Vietoris-Rips filtration has a sublinear-size approximation that leads to a simple $o(n^2)$ time algorithm for approximating persistent homology of the Vietoris-Rips filtration on $n$ points in a metric space.

The running time also depends on the error of approximation and on the doubling dimension of the metric space, see Proposition 4.10. Our algorithm can improve the filtration on the fly, without starting from scratch, to get a smaller error of approximation at a higher computational cost.

2 Basic definitions and auxiliary results

Definition 2.1. In a metric space $(M, D)$ with a distance $D : M \times M \to \mathbb{R}$, the (closed) ball with a center $c \in M$ and a radius $r > 0$ is $B(c; r) = \{a \in M \mid D(a, c) \le r\}$.
The doubling constant $\lambda$ of $(M, D)$ is the minimum number of balls of radius $r$ that can cover any ball of radius $2r$. The doubling dimension of $(M, D)$ is $\dim = \lceil \log_2 \lambda \rceil$, so $\lambda \le 2^{\dim}$. If a metric space $(M, D)$ is finite, then the spread $\Phi$ (or the aspect ratio) is the ratio of the largest to smallest interpoint distances $D(a, b)$ over all distinct $a, b \in M$.

Definition 2.1 implies that any subspace of a finite metric space $(M, D)$ with a doubling dimension $\dim$ has a doubling dimension at most $\dim$.
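As a small illustration of Definition 2.1, the closed ball and the spread of a finite metric space can be computed directly from a distance function. This is a minimal sketch: the helper names `closed_ball` and `spread` are hypothetical, and the points are assumed to be pairwise distinct so that the smallest interpoint distance is positive. The doubling constant is not computed here, since finding a minimum cover by balls is a much harder problem.

```python
from itertools import combinations


def closed_ball(points, dist, center, radius):
    """Points a with dist(a, center) <= radius, i.e. the closed ball B(center; radius)."""
    return [a for a in points if dist(a, center) <= radius]


def spread(points, dist):
    """Ratio of the largest to the smallest distance over all pairs of distinct points.

    Assumes at least two points and that all points are pairwise distinct.
    """
    distances = [dist(a, b) for a, b in combinations(points, 2)]
    return max(distances) / min(distances)


# Four points on the real line with the usual distance |a - b|.
points = [0, 1, 3, 7]
line_dist = lambda a, b: abs(a - b)

print(closed_ball(points, line_dist, 1, 2))  # [0, 1, 3]
print(spread(points, line_dist))             # 7.0
```

Here the largest distance is 7 (between 0 and 7) and the smallest is 1 (between 0 and 1), so the spread is 7.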
Journal: CoRR
Volume: abs/1312.1494
Publication date: 2013